With the advancement of digital image processing in agriculture and crop
cultivation, imaging techniques are being adopted to acquire real-time plant
health status. Of all the parts of a plant, the leaf is the most direct
indicator of its health, and hence applying image processing approaches to
leaves can yield informative evidence of plant health. At present, various
approaches, e.g., feature extraction, segmentation, identification, and
classification, are evolving with more dependencies
1. INTRODUCTION

With the growing population, there is an increasing demand for quality nutrition from higher-grade
food. This can only be ensured when cultivation takes place in a healthy environment, because healthy
crops offer higher quality grains. However, diseases in plants are inevitable, and they degrade the
quality of the grains. Therefore, there is considerable concern about diseases that afflict plants and
reduce production and yield. Such infection can occur in any part of a plant: roots, stems, branches,
buds, flowers, and leaves. There can be multiple reasons for this, viz. adverse climatic conditions,
degraded soil quality, poor irrigation, inferior farming practices, and adoption of conventional methods.
With the increasing use of technology in agriculture and cultivation, there are better chances of higher
and better-quality yield in plants [1]-[5]. Sensors can be deployed over cultivation fields to extract
various data associated with the plants [6], [7]. Sensors can capture images of plants and transmit them
to a remote end, where an image processing algorithm is executed to determine the real-time status of the
plant's health. The majority of the diseases that negatively affect crop yield are highly visible, and
this visibility can serve as an identifier for image processing algorithms. Hence, an algorithm can be
constructed from formulated identifiers that match specific information about the disease. A human can
also assess such visual information to identify the disease condition in plants. However, a human cannot
monitor such abnormalities over a large crop field. Therefore, there is a need for an automated approach
that removes the need for human intervention in this identification task. As most of the problems
associated with disease are visible, this information can be checked autonomously by a machine using
image processing. However, there are various challenges involved in this process. The first challenge is
to extract features from many crops, and hence feature extraction is an essential operation [8]. There
are various feature extraction mechanisms using features such as shape, energy, entropy, contrast,
texture, fractal dimension, variation in red green blue (RGB) color, descriptors, and grey-level
histograms. Extensive work on this has been carried out in existing systems. The extracted features are
then used for further processing in order to carry out identification. Two further processes follow, viz.
identification and classification. The identification process generally attempts to match the defined
data of a disease with the input image, while classification generally applies a machine learning
approach [9], [10] to categorize the types of disease. However, all these approaches are used for disease
detection in plant leaf images and are also deployed for other purposes.
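
As an illustration of the first-order features named above, the following minimal Python sketch computes
the mean, variance, energy, and entropy of a grey-level histogram. The function name
grey_level_features and the use of variance as a simple contrast measure are assumptions made for
illustration; they are not taken from the reviewed studies.

```python
import math


def grey_level_features(pixels, levels=256):
    """First-order histogram features for integer grey values in [0, levels).

    Illustrative sketch only: 'contrast' is taken to be the grey-level
    variance, which is one common first-order choice (an assumption here).
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("pixels must not be empty")

    # Build the grey-level histogram and normalise it to probabilities.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    probs = [h / n for h in hist]

    mean = sum(i * p for i, p in enumerate(probs))
    variance = sum((i - mean) ** 2 * p for i, p in enumerate(probs))
    energy = sum(p * p for p in probs)  # high for uniform regions
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)  # in bits

    return {"mean": mean, "contrast": variance, "energy": energy, "entropy": entropy}
```

A uniform patch gives zero entropy and maximal energy, whereas a patch split evenly between two grey
levels gives an entropy of one bit, which is why such features can separate smooth healthy tissue from
mottled regions.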

This paper offers a snapshot of the effectiveness of existing approaches in applying computer vision
to identifying abnormalities in crop cultivation. The paper is organized as follows: section 2 discusses
the diseases in plant leaves, followed by a discussion of existing approaches in section 3, which covers
all essential approaches and highlights their strengths and weaknesses. Existing research trends are
briefed in section 4. Section 5 discusses the open-end issues of the existing studies, and section 6
summarizes the paper.

Diseases in plant leaves: Before applying image processing to plant leaf images, it is essential to
understand the information about the diseases associated with plants. Table 1 highlights some of the
frequently studied diseases diagnosed from plant leaves with respect to disease representation,
corresponding pathogens, color space, and the classifier used to categorize them. Infection in plants can
occur by various means, and the leaf is one of the prominent areas for its diagnosis. The information in
Table 1 shows various ways to analyze the disease and identify the pathogens. Understanding this is
essential in order to develop an adequate identification and classification approach. Such diseases can
occur for various reasons, viz. nutrient deficiency, fungus, insects, and bacteria, as shown in Figure 1.
This information assists in developing a better classification process.


Table 1. Existing methods

Plant type | Representation of disease | Pathogen | Color space | Classification
Pea/bean [11] | Changes in the color of a leaf | Deficiency of nutrient | HIS | Spatial (Euclidean)
Maize [12] | Color of the leaf, changes in morphology | Sheath blight, leaf blight | YCbCr | Neural network (Backpropagation)
Rice [13] | -do- | Deficiency of nutrient | RGB | Multilayer perceptron
Grapes [14] | Color of leaf, pigmentation | Pest | RGB | Neural network
Cotton [15] | Strikes, stains, spots | Bacteria, bug | Hue, saturation, value (HSV), RGB | Support vector machine
Soyabean [16] | Spot in leaf | Fungus | HIS | The ratio of a spot with the lesion to the area of leaf
Maize [17] | Chlorotic area | Maize streak virus | Greyscale | Thresholding of pixel
Maize [18] | Holes in leaf | Fall armyworm | Greyscale | Thresholding
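
Several entries in Table 1 rely on pixel thresholding or on the ratio of lesion area to leaf area. A
minimal sketch of that idea follows, under the assumption that lesion pixels are darker than healthy
tissue in a greyscale image and that a binary leaf mask is already available. The function name
lesion_ratio, the mask, and the threshold value are hypothetical and are not taken from the cited
studies.

```python
def lesion_ratio(grey, leaf_mask, threshold):
    """Fraction of leaf pixels whose grey value is below `threshold`.

    grey      : list of rows of grey values
    leaf_mask : list of rows of 0/1 flags marking leaf pixels
    threshold : grey level below which a leaf pixel counts as lesion
                (assumes lesions are darker than healthy tissue)
    """
    leaf_pixels = 0
    lesion_pixels = 0
    for grey_row, mask_row in zip(grey, leaf_mask):
        for value, is_leaf in zip(grey_row, mask_row):
            if is_leaf:
                leaf_pixels += 1
                if value < threshold:
                    lesion_pixels += 1
    if leaf_pixels == 0:
        raise ValueError("leaf mask contains no leaf pixels")
    return lesion_pixels / leaf_pixels
```

In practice the threshold would have to be tuned per crop and imaging condition, which is one reason
fixed-threshold methods are sensitive to lighting variation.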


[Figure: Reasons for diseases in plant leaves; recoverable labels: nutrient deficiency, bacteria]

Figure 1. Factors for infection in plants

Bacterial infection is identified from various external visual symptoms, e.g., brown spot, soft rot,
blight, Myrothecium, and mould. Fungal infection can be identified by leaf spot, red spot, stripe, rust,
black spot, and scab. However, this is not the case for nutrient deficiency, as its effect may sometimes
be visible in leaves and sometimes be internal. Nutrient deficiency affects the complete plant and is
much more adverse than other diseases found from leaves. Diseases caused by insects can be visually
identified on leaves. All this information is technically required to formulate the criteria for
detection and classification in plant leaves. Among the different approaches, existing work mainly tends
to adopt machine learning, especially deep learning techniques. There are specific reasons for this, viz.
i) deep learning allows effective handling of data in the presence of a high degree of noise; ii) deep
learning also leads to higher accuracy on test images and tends to reduce the dependency on a large
number of test images; iii) identification criteria of complex types can be well formulated in a deep
learning approach; and iv) deep learning has also led to the evolution of predictive approaches. Some
studies have used a statistical inference approach for disease identification from plant leaves.

Therefore, various approaches are used at present for identifying diseases from plant leaves. Some
address one specific technique, while others amalgamate multiple existing approaches to achieve the
objective of disease identification. The next section discusses in more detail the recent contributions
of researchers in applying digital image processing to plant leaves.

At present, various work is being carried out to investigate specific problems associated with plant
leaves using image processing. For this purpose, reputed journals were thoroughly surveyed to find the
significant processing techniques applied to plant leaves, as shown in Table 1. Such investigations are
carried out for two purposes, viz. disease identification and other, application-specific purposes. All
journal papers published within the last decade have been considered for the review. The prime target is
to understand the strengths and weaknesses of image processing approaches, taking plant leaves as a case
study. The approaches are briefed below.

The first set of approaches relates to segmentation, which differentiates the foreground object from
the background scene. This process is used to localize the object in a given scene. A unique automated
segmentation approach is introduced by Janssens et al. [19], where the system segments leaves from
plants and obtains information about the leaf's symmetry line. The disease factor can be identified from
this line of symmetry. The contribution of this work is a unique feature extraction in which image
moments are extracted based on contours. The technique is found to process multiple images in parallel.
Another essential segmentation approach is introduced by Sun et al. [20], where multiple linear
regression is used.
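
To make the notion of image moments concrete, the sketch below computes the area, centroid, and
principal-axis angle of a binary leaf mask from its second-order central moments. It is a generic
illustration rather than the method of Janssens et al. [19]: it uses region moments over all foreground
pixels instead of contour moments, and treating the principal axis as a rough proxy for the symmetry
line is an assumption. The function name mask_moments is hypothetical.

```python
import math


def mask_moments(mask):
    """Area, centroid and principal-axis angle of a binary mask.

    mask : list of rows of 0/1 values (1 = foreground/leaf pixel)
    Returns (area, (cx, cy), theta) with theta in radians, measured from
    the x-axis (image coordinates, y pointing down).
    """
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(points)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")

    # First-order moments give the centroid.
    cx = sum(x for x, _ in points) / area
    cy = sum(y for _, y in points) / area

    # Normalised second-order central moments describe the spread.
    mu20 = sum((x - cx) ** 2 for x, _ in points) / area
    mu02 = sum((y - cy) ** 2 for _, y in points) / area
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / area

    # Orientation of the axis of largest variance.
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return area, (cx, cy), theta
```

For an elongated leaf the returned angle follows the long axis of the mask, so a deviation of the lesion
distribution on either side of that axis could then be measured in a later step.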

The second approach relates to feature extraction, which obtains numerical features from the raw
image, making the image suitable for further processing without losing significant information. Another
benefit of feature extraction is dimensionality reduction. Feature extraction has been carried out on
plant leaves to identify essential disease spots. The work of Li et al. [21] discusses histogram-based
segmentation using an evolutionary approach, where the gray portion of the leaves represents the disease
spot. The study uses statistical and visual features to locate the lesion. The technique uses a genetic
algorithm to carry out segmentation; however, increasing the number of images degrades the search
efficiency.

Moreover, this work does not support persistent feature extraction if the environment is dynamic.
This problem is addressed by Lv et al. [22], where maize features are extracted in a complex environment
using a neural network. The overfitting issues of the neural network are addressed using batch
normalization. A similar direction of work is carried out by Sun et al. [23], where deep learning is used
along with the fusion of multi-scale features. This mechanism also preprocesses the dataset using a
Retinex-based approach. The study fuses low- and high-level features in order to achieve better accuracy.
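
For reference, batch normalization, mentioned above as the remedy applied against overfitting,
standardizes each feature over a mini-batch and then applies a learnable scale and shift. The sketch
below shows the forward computation for a single feature in plain Python. The parameter names gamma,
beta, and eps follow common usage and are assumptions; they are not taken from Lv et al. [22].

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one feature across a mini-batch, then scale and shift.

    batch : list of floats, one activation per sample in the mini-batch
    gamma : scale (learned during training in a real network)
    beta  : shift (learned during training in a real network)
    eps   : small constant that avoids division by zero
    """
    n = len(batch)
    if n == 0:
        raise ValueError("batch must not be empty")
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased batch variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
```

With the default gamma and beta, the output has approximately zero mean and unit variance regardless of
the input scale, which stabilizes training across images captured under different conditions.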

The third and most frequently used approach is classification for identifying the disease in a plant
leaf. Machine learning is the dominant method for addressing classification-related issues. The
application of machine learning to identify leaf rust disease is discussed in the work of Ashourloo et
al. [24]. The technique evaluates the spectra of leaf images in the electromagnetic region, where
multiple machine learning approaches are assessed, viz. Gaussian process regression, support vector
regression, and partial least squares regression. The study outcome shows the robustness of the machine
learning approach. Although this technique offers robustness in the learning mechanism, it still induces
higher memory consumption, which increases the training effort over different forms of images. Hence,
better optimization is required to improve classification performance further. A study toward this goal
uses a bacterial foraging algorithm integrated with the frequently used radial basis function neural
network. This work is discussed by Chouhan et al. [25], where bacterial foraging is used to assign
enhanced weights to the radial basis function. This is reported to increase accuracy. The study considers
the identification and classification of fungal disease on the leaf, where higher accuracy is noted.
However, the study is carried out over pre-trained images, so the relationship between its results and
image quality cannot be identified. This issue is considered by Dai et al. [26], where it is stated that
a fusion-based approach with a generative network could yield better results.

The idea of this study is also to obtain an image with a higher resolution. The strength of this
technique is its ability to detect multiple diseases, but the texture information it depends on is not
found within the dataset. The prime reason for this is the non-inclusion of vegetation indices. Even
such indices are considered limited in their capability to identify variations in features across
different diseases. This problem is addressed by Huang et al. [27], where vegetation indices are
substituted by spectral indices obtained from hyperspectral images of wheat. The study obtains weight
information from bands of highly correlated wavelengths in order to generate spectral indices. The
outcome shows better detection performance. The adoption of hyperspectral images for disease detection
is also seen in the work of Moriya et al. [28], considering the case study of sugarcane plants. The study
develops a library of hyperspectral images of both healthy and unhealthy plants. The technique further
uses a radiometer block to obtain specific reflectance features for generating a mosaic for
identification. Kappa statistics are then used for the classification process. The work of Jiang et al.
[29] considers the identification of disease in apple leaves using an enhanced convolutional neural
network in real time.
To some extent, the adoption of a rule-based approach offers better optimization in handling such
issues. Work in this direction is reported by Kaur et al. [30], where a rule-based approach is used for
classification, considering the case study of leaf disease in the soybean plant. Using the k-means
algorithm, the study uses multiple forms of features (e.g., texture and color), and training is carried
out using a support vector machine. Pham et al. [31] carried out a study on early classification,
emphasizing a heuristic-based solution using a hybrid approach. The idea is to implement a feed-forward
network to identify small diseases in plants. The study also performs contrast enhancement as a
preprocessing step, followed by blob segmentation. The neural network takes the features as input to
carry out training and later classification. The adoption of deep learning is also seen in Wang et al.
[32], where an automated process is used to assess the severity of the disease in plants. This
classification approach uses threshold-based segmentation and feature engineering. A similar approach to
detecting severity is used by Zeng et al. [33] using deep learning. The comparison of strengths and
weaknesses for the different authors is shown in Table 2.


Table 2. Summary of the effectiveness of existing approaches

Authors | Problems | Strength | Weakness
Janssens et al. [19] | Segmentation | Parallel processing | Uses high-end processor to get results
Sun et al. [20] | Segmentation | Reliability, higher precision | Accuracy can be further optimized; uses high-end processor to get results
Li et al. [21] | Feature extraction | Enhance the efficiency of search | Does not support scalable performance
Lv et al. [22] | Feature extraction | Optimal learning process | Specific to disease
Sun et al. [23] | Feature extraction | Higher precision | Specific to disease
Ashourloo et al. [24] | Classification | Robust learning method | Could induce computational complexity


2. RESEARCH TRENDS 

Understanding the research trend is essential to visualize the direction of ongoing research in the
area of analyzing disease from plant leaves. For this purpose, all the research papers published during
2010-2020 were collected from reputed publications and reviewed. The analysis shows various explicit
trends, which are discussed in this section.


3. ANALYSIS OF EXISTING METHODS 

At present, many methods have been developed to address the issues associated with applying digital
image processing to plant leaves. Figure 2 shows that the majority of the work has been carried out on
the feature extraction process. The emphasis falls on feature extraction because the information
obtained from the extracted features contributes strongly to detection and classification. The adoption
of deep learning, neural networks, and support vector machines is the next trend observed, which means
that considerable work is carried out on classification using machine learning. However, segmentation,
despite being an important part of image processing, has received less attention, and a similar trend is
found for regression. The rule-based method of fuzzy logic is also found to be rarely adopted in existing
approaches. Hence, the inference is that feature extraction methods and machine learning are the most
frequently adopted research topics at present.

Figure 2. Trends towards existing methods


4. ANALYSIS OF ACCURACY ACHIEVED 

Accuracy is one of the essential parameters for identification as well as classification. Exploring
the research trend in accuracy is essential to understand the level of achievement, which signifies the
strength of existing approaches. Figure 3 shows the mean of the accuracies achieved across all research
work published between 2011 and 2020. A closer look at the trend shows a steep fall in accuracy from
2011 to 2014, when researchers focused on different sets of problems and treated accuracy as a secondary
performance parameter. These problems include testing with multiple forms of images, checking visual
quality, and considering the training and run time of the algorithm. However, accuracies started
increasing from 2015 onwards, indicating that researchers are prioritizing accuracy as the essential
parameter of their experiments. The trend also shows that the accuracy achieved remains nearly linear
during 2018-2020.

Figure 3. Trends towards accuracy obtained


5. ANALYSIS OF SCALABILITY

Scalability is assessed by evaluating the accuracy score over an increasing number of images in the
dataset. It shows how consistently a given algorithm maintains accuracy when exposed to different forms
of images. The outcome in Figure 4 highlights that there is 55% consistency in accuracy when exposed to
an increasing number of different forms of images. Therefore, existing approaches need more work on this
scalability factor and are not yet ready to be deployed for commercial applications, which demand much
better consistency in accuracy. Hence, more experiments and more modelling are needed to evolve new
solutions for this scalability factor of accuracy.

Figure 4. Trends towards accuracy concerning images


6. OPEN END RESEARCH PROBLEMS 

Open-end research problems: Very little work has been carried out on preprocessing the input plant
leaf images. Preprocessing for plant disease analysis by computer vision systems faces some unique
challenges compared to preprocessing in other image processing tasks. These challenges include: i)
occlusion of the leaves in the original images, where overlapping conditions constitute noise; ii)
contrast mismatch among the fruit, its leaves, and the background, which requires efficient adjustment;
iii) highly varying lighting conditions in real-time scenarios, as weather and sun variation bring
associated challenges, and the density of the orchard produces a low-intensity effect on the input image
data; and iv) the imaging modality, whose various wavelength bands yield a large number of correlated but
redundant features.

Although studies have been carried out on segmentation, they have not addressed its fundamental
challenges. The challenges in separating the disease-affected portion from the image background include:
i) handling the dynamics of color associated with the disease during segmentation based on the color
feature set; ii) handling the dimension overhead due to the large number of color features of the
original color of the fruits; and iii) variations in lighting conditions, the dimension or spread of the
disease, and the volume of the fruits, which are further essential challenges to be considered while
designing effective segmentation algorithms. Another important challenge is that the popular region
growing algorithm is time-consuming, so its time overhead makes it unsuitable for real-time processing.
In addition, localizing the disease-affected area and capturing its geometric features and textures
require effective descriptors and extractors to obtain a better feature set for the learning model.
Therefore, open-end research problems can be summarized as:

a. Sensors and advanced imaging systems pose enormous complexities for deployment, as they are not
cost-effective.

b. Many of the methods consider only accuracy as the performance parameter and do not consider metrics
like F1-score to handle the trade-off between precision and recall.

c. Much of the work in disease detection focuses on the classification part, but significantly less work
addresses an integrated framework considering all aspects, such as preprocessing, segmentation,
feature extraction, and the learning model, to provide better overall sensitivity and specificity
results along with computation and time complexity trade-offs.

d. For model validation, most models benchmark their work or optimize the model only for accuracy.
In contrast, apart from specificity and sensitivity analysis, the focus should also be on minimizing
computational and time complexity to move the evolution towards real-time implementations.

e. The literature lacks significant inclusion of optimization methods.
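
The point above about relying on accuracy alone can be illustrated with a short sketch that derives
accuracy, precision, recall (sensitivity), specificity, and F1-score from the four counts of a binary
confusion matrix. The function name binary_metrics and the convention of returning 0.0 for an undefined
ratio are assumptions made for this illustration.

```python
def binary_metrics(tp, fp, fn, tn):
    """Common evaluation metrics from binary confusion-matrix counts.

    tp : diseased leaves correctly flagged as diseased
    fp : healthy leaves wrongly flagged as diseased
    fn : diseased leaves missed by the classifier
    tn : healthy leaves correctly left unflagged
    """
    def ratio(numerator, denominator):
        # Return 0.0 when a metric is undefined (illustrative convention).
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }
```

For example, with 40 true positives, 10 false positives, 20 false negatives, and 930 true negatives, the
accuracy is 0.97 while the F1-score is only about 0.73, because a third of the diseased leaves are
missed. On imbalanced field data, accuracy can therefore mask poor detection of the diseased class.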


CONCLUSION 
This paper has discussed the essential approaches used to identify abnormalities in crop cultivation
using plant leaf images. Despite certain strengths in existing image processing techniques, a significant
number of challenges remain associated with them. Hence, our future work will focus on addressing the
identified challenges and open-end research problems. The study will be directed toward developing an
evaluation framework in order to offer better performance.